Introduction to HPC

Slurm Scheduler and Resource Manager

Manuel Holtgrewe

Berlin Institute of Health at Charité

Session Overview

Aims

  • Understand the role of Slurm as a scheduler and resource manager.
  • Learn how to use Slurm to run jobs on the cluster and …
  • … how to interrogate Slurm for cluster and job status.
  • Using Conda and Apptainer for portable software installations.

Actions

  • Submit interactive and batch jobs.
  • Write Slurm job scripts.
  • Query slurm with squeue, sinfo and scontrol.
  • Using Conda and Apptainer

Documentation Resources

🤓 Google/Bing will help to find more

👍 User Forum at https://hpc-talk.cubi.bihealth.org!

Your Experience? 🤸

  • Do you already
    • have experience with HPC?
    • know Slurm?
    • know another job scheduler or resource manager?

Slurm

  • Introduction
  • Running Interactive and Batch Jobs
  • Querying job and cluster status

What is a Scheduler?

Resource Manager

  • Slurm keeps a ledger of our node and job resources
    • available CPUs, memory, GPUs on each node
    • jobs and their required resources
    • currently running jobs and their used resources

Job Scheduler

  • Slurm manages a schedule of submitted jobs
    • Freshly submitted jobs are subjected to quick scheduling
    • Periodically, Slurm will run full backfill scheduling

Our First Interactive Job: Submission 🎬

holtgrem_c@hpc-login-1$ srun --pty --time=2:00:00 --partition=training \
    --mem=10G --cpus-per-task=1 bash -i
srun: job 14629328 queued and waiting for resources
srun: job 14629328 has been allocated resources
holtgrem_c@hpc-cpu-141$
  • start interactive job with srun
    • –pty connects the job’s standard output and error streams to your shell
    • –time=2:00:00 makes your job last up to two hours
    • –partition=training submits into the training partition
    • –mem=10G allocates 10GB of RAM to your job
    • –cpus-per-task=1 allocates one thread for our task
    • bash -i is the command to run (interactive bash)
  • now: look at your job in another shell: 🤸
    • squeue -u $USER
    • scontrol show job 14629328

Looking at squeue 🤸

What is the output of squeue?

holtgrem_c@hpc-cpu-141$ squeue -u $USER
   JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
14629328  training     bash holtgrem  R       1:40      1 hpc-cpu-141

More info with --long:

Tue Jul 11 15:21:13 2023
   JOBID PARTITION     NAME     USER    STATE       TIME TIME_LIMI  NODES NODELIST(REASON)
14629328  training     bash holtgrem  RUNNING       3:20   2:00:00      1 hpc-cpu-141

👉 Slurm Documentation: squeue

Looking at scontrol show job 🤸

Let us look at scontrol show job 14629328

holtgrem_c@hpc-cpu-141$ scontrol show job 14629328
JobId=14629328 JobName=bash
   UserId=holtgrem_c(100131) GroupId=hpc-ag-cubi(1005272) MCS_label=N/A
   Priority=661 Nice=0 Account=hpc-ag-cubi QOS=normal
   JobState=RUNNING Reason=None Dependency=(null)
   Requeue=1 Restarts=0 BatchFlag=0 Reboot=0 ExitCode=0:0
   RunTime=00:06:37 TimeLimit=02:00:00 TimeMin=N/A
   SubmitTime=2023-07-11T15:17:37 EligibleTime=2023-07-11T15:17:37
   AccrueTime=2023-07-11T15:17:37
   StartTime=2023-07-11T15:17:53 EndTime=2023-07-11T17:17:53 Deadline=N/A
   SuspendTime=None SecsPreSuspend=0 LastSchedEval=2023-07-11T15:17:53 Scheduler=Backfill
   Partition=training AllocNode:Sid=hpc-login-1:3631083
   ReqNodeList=(null) ExcNodeList=(null)
   NodeList=hpc-cpu-141
   BatchHost=hpc-cpu-141
   NumNodes=1 NumCPUs=1 NumTasks=1 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
   TRES=cpu=1,mem=10G,node=1,billing=1
   Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*
   MinCPUsNode=1 MinMemoryNode=10G MinTmpDiskNode=0
   Features=(null) DelayBoot=00:00:00
   OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)
   Command=bash
   WorkDir=/data/cephfs-1/home/users/holtgrem_c
   Power=

👉 Slurm Documentation: scontrol

Observing a job’s states 🤸 (1/4)

The job in PENDING state

holtgrem_c@hpc-cpu-141$ scontrol show job 14629381
JobId=14629381 JobName=bash
   UserId=holtgrem_c(100131) GroupId=hpc-ag-cubi(1005272) MCS_label=N/A
   Priority=661 Nice=0 Account=hpc-ag-cubi QOS=normal
   JobState=PENDING Reason=Priority Dependency=(null)
   Requeue=1 Restarts=0 BatchFlag=0 Reboot=0 ExitCode=0:0
   RunTime=00:00:00 TimeLimit=02:00:00 TimeMin=N/A
   SubmitTime=2023-07-11T15:26:33 EligibleTime=2023-07-11T15:26:33
   AccrueTime=Unknown
   StartTime=Unknown EndTime=Unknown Deadline=N/A
   SuspendTime=None SecsPreSuspend=0 LastSchedEval=2023-07-11T15:26:33 Scheduler=Main
   Partition=short AllocNode:Sid=hpc-login-1:3644832
   ReqNodeList=(null) ExcNodeList=(null)
   NodeList=
   NumNodes=1 NumCPUs=1 NumTasks=1 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
   TRES=cpu=1,mem=10G,node=1,billing=1
   Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*
   MinCPUsNode=1 MinMemoryNode=10G MinTmpDiskNode=0
   Features=(null) DelayBoot=00:00:00
   OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)
   Command=bash
   WorkDir=/data/cephfs-1/home/users/holtgrem_c
   Power=

Observing a job’s states 🤸 (2/4)

The job while running on a node:

holtgrem_c@hpc-cpu-141$ scontrol show job 14629381
JobId=14629381 JobName=bash
   UserId=holtgrem_c(100131) GroupId=hpc-ag-cubi(1005272) MCS_label=N/A
   Priority=661 Nice=0 Account=hpc-ag-cubi QOS=normal
   JobState=RUNNING Reason=None Dependency=(null)
   Requeue=1 Restarts=0 BatchFlag=0 Reboot=0 ExitCode=0:0
   RunTime=00:05:04 TimeLimit=02:00:00 TimeMin=N/A
   SubmitTime=2023-07-11T15:26:33 EligibleTime=2023-07-11T15:26:33
   AccrueTime=2023-07-11T15:26:33
   StartTime=2023-07-11T15:26:53 EndTime=2023-07-11T17:26:53 Deadline=N/A
   SuspendTime=None SecsPreSuspend=0 LastSchedEval=2023-07-11T15:26:53 Scheduler=Backfill
   Partition=short AllocNode:Sid=hpc-login-1:3644832
   ReqNodeList=(null) ExcNodeList=(null)
   NodeList=hpc-cpu-144
   BatchHost=hpc-cpu-144
   NumNodes=1 NumCPUs=1 NumTasks=1 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
   TRES=cpu=1,mem=10G,node=1,billing=1
   Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*
   MinCPUsNode=1 MinMemoryNode=10G MinTmpDiskNode=0
   Features=(null) DelayBoot=00:00:00
   OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)
   Command=bash
   WorkDir=/data/cephfs-1/home/users/holtgrem_c
   Power=

Observing a job’s states 🤸 (3/4)

The job “just” after being terminated:

holtgrem_c@hpc-cpu-141$ scontrol show job 14629381
JobId=14629381 JobName=bash
   UserId=holtgrem_c(100131) GroupId=hpc-ag-cubi(1005272) MCS_label=N/A
   Priority=661 Nice=0 Account=hpc-ag-cubi QOS=normal
   JobState=COMPLETED Reason=None Dependency=(null)
   Requeue=1 Restarts=0 BatchFlag=0 Reboot=0 ExitCode=0:0
   RunTime=00:07:52 TimeLimit=02:00:00 TimeMin=N/A
   SubmitTime=2023-07-11T15:26:33 EligibleTime=2023-07-11T15:26:33
   AccrueTime=2023-07-11T15:26:33
   StartTime=2023-07-11T15:26:53 EndTime=2023-07-11T15:34:45 Deadline=N/A
   SuspendTime=None SecsPreSuspend=0 LastSchedEval=2023-07-11T15:26:53 Scheduler=Backfill
   Partition=short AllocNode:Sid=hpc-login-1:3644832
   ReqNodeList=(null) ExcNodeList=(null)
   NodeList=hpc-cpu-144
   BatchHost=hpc-cpu-144
   NumNodes=1 NumCPUs=1 NumTasks=1 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
   TRES=cpu=1,mem=10G,node=1,billing=1
   Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*
   MinCPUsNode=1 MinMemoryNode=10G MinTmpDiskNode=0
   Features=(null) DelayBoot=00:00:00
   OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)
   Command=bash
   WorkDir=/data/cephfs-1/home/users/holtgrem_c
   Power=

Observing a job’s states 🤸 (4/4)

After some time, the job is not known to the controller any more…

holtgrem_c@hpc-cpu-141$ scontrol show job 14629381
slurm_load_jobs error: Invalid job id specified

… but we can still get some information from the accounting (for 4 weeks) …

holtgrem_c@hpc-cpu-141$ sacct -j 14629381
JobID           JobName  Partition    Account  AllocCPUS      State ExitCode
------------ ---------- ---------- ---------- ---------- ---------- --------
14629381           bash      short hpc-ag-cu+          1  COMPLETED      0:0
14629381.ex+     extern            hpc-ag-cu+          1  COMPLETED      0:0
14629381.0         bash            hpc-ag-cu+          1  COMPLETED      0:0

You can use sacct -j JOBID --long | less -SR to see all available accounting information.

Our First Batch Job 🎬 (1/5)

holtgrem_c@hpc-login-1$ cat >first-job.sh <<"EOF"
#!/usr/bin/bash

echo "Hello World"
sleep 1min
EOF
holtgrem_c@hpc-login-1$ sbatch first-job.sh
sbatch: You did not specify a running time. Defaulting to two days.
sbatch: routed your job to partition medium
sbatch:
Submitted batch job 14629473

Our First Batch Job 🎬 (2/5)

Sadly, it failed:

holtgrem_c@hpc-login-1$ scontrol show job 14629473
JobId=14629473 JobName=first-job.sh
   UserId=holtgrem_c(100131) GroupId=hpc-ag-cubi(1005272) MCS_label=N/A
   Priority=761 Nice=0 Account=hpc-ag-cubi QOS=normal
   JobState=FAILED Reason=NonZeroExitCode Dependency=(null)
   Requeue=1 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=1:0
   RunTime=00:00:00 TimeLimit=2-00:00:00 TimeMin=N/A
   SubmitTime=2023-07-11T15:44:31 EligibleTime=2023-07-11T15:44:31
   AccrueTime=2023-07-11T15:44:31
   StartTime=2023-07-11T15:44:54 EndTime=2023-07-11T15:44:54 Deadline=N/A
   SuspendTime=None SecsPreSuspend=0 LastSchedEval=2023-07-11T15:44:54 Scheduler=Backfill
   Partition=medium AllocNode:Sid=hpc-login-1:3644832
   ReqNodeList=(null) ExcNodeList=(null)
   NodeList=hpc-cpu-219
   BatchHost=hpc-cpu-219
   NumNodes=1 NumCPUs=1 NumTasks=1 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
   TRES=cpu=1,mem=1G,node=1,billing=1
   Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*
   MinCPUsNode=1 MinMemoryCPU=1G MinTmpDiskNode=0
   Features=(null) DelayBoot=00:00:00
   OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)
   Command=/data/cephfs-1/home/users/holtgrem_c/first-job.sh
   WorkDir=/data/cephfs-1/home/users/holtgrem_c
   StdErr=/data/cephfs-1/home/users/holtgrem_c/slurm-14629473.out
   StdIn=/dev/null
   StdOut=/data/cephfs-1/home/users/holtgrem_c/slurm-14629473.out
   Power=

Our First Batch Job 🎬 (3/5)

Troubleshooting our job failure:

holtgrem_c@hpc-login-1$ cat /data/cephfs-1/home/users/holtgrem_c/slurm-14629473.out
Hello World
sleep: invalid time interval ‘1min’
Try 'sleep --help' for more information.

Our First Batch Job 🎬 (4/5)

More troubleshooting hints:

  1. scontrol | grep Reason
    • NonZeroExitCode? Timeout? Out of memory?
  2. Does WorkDir exist and do you have access?
    • Try: cd $WorkDir
  3. Look at the StdOut/StdErr log files, if any.
  4. Look at sacct -j 14629473 --format=JobID,State,ExitCode,Elapsed,MaxVMSize to look for hints regarding running time/memory (VM) size

Our First Batch Job 🎬 (5/5)

holtgrem_c@hpc-login-1$ cat >first-job.sh <<"EOF"
#!/usr/bin/bash

echo "Hello World"
sleep 1m
EOF
holtgrem_c@hpc-login-1$ sbatch first-job.sh
sbatch: You did not specify a running time. Defaulting to two days.
sbatch: routed your job to partition medium
sbatch:
Submitted batch job 14629474

👉hpc-docs: sbatch
👉hpc-docs: srun

Resource Allocation with srun/sbatch

We can explicitely allocate resources with the srun and sbatch command lines:

  • --job-name=MY-JOB-NAME: explicit naming
  • --time=D-HH:MM:SS: max running time
  • --partition=PARTITION: partition
  • --mem=MEMORY: allocate memory, use <num>G or <num>M
  • --cpus-per-task=CORES: number of cores to allocate

👉 Slurm Documentation: sbatch

Resource Allocation in Job Scripts

holtgrem_c@hpc-login-1$ cat >first-job.sh <<"EOF"
#!/usr/bin/bash

#SBATCH --job-name=tired-but-extravagent
#SBATCH --time=0:05:00
#SBATCH --partition=short
#SBATCH --mem=2G
#SBATCH --cpus-per-task=4

echo "I will waste 2GB of RAM and 4 corse for 1 min..."
sleep 1m
EOF

Your Turn: Writing Job Scripts 🤸

Write a job script that …

  1. allocates minimal memory for sleep 1m (hint: how can you figure out the maximal memory used?)
  2. writes separate stdout and stderr files (where could this be useful?)
  3. is called job-1.sh that triggers job-2.sh on completion (is this useful? dangerous?)

Use online resources to figure out the right command line parameters.

Your Turn: 👀 Staring at the Scheduler 🤸

Use the following commands and use the online help and man $command to figure out the output:

  • sdiag
  • squeue
  • sinfo
  • scontrol show node NODE
  • sprio -l -S -y

Your Turn: 👓 Staring at the Scheduler 🤸

Use the following commands and use the online help and man $command to figure out the output:

  • sdiag
    • quick look to see scheduler status and load
  • squeue
    • investigate current queue status
  • sinfo
    • get an overview of nodes’ health and load
  • scontrol show node NODE
    • look at node health and load
  • sprio -l -S -y
    • look at current scheduler priorities for jobs

Your Turn: Make it Fail! 🤸

Provoke the following situations:

  1. Work directory does not exist.
  2. Work directory exists but you have no access.
  3. Stdout/stderr files cannot be written
  4. Too many cores allocated (try: 100)
  5. Job needs too much memory (allocate 500MB, then use this)
  6. Job runs into timeout

In each case, look at scontrol/sacct output and look at log files.

Slurm Partitions (1)

  • The nodes in the cluster are assigned to partitions.
  • Partitions are used to
    • separate nodes with special hardware (high memory, GPU)
    • provide access to different quality of service
  • By default, all jobs go to standard
    • they will then be routed to the appropriate partition based on their resource requirements
    • you can override this with --partition=PARTITION on the command line
  • You must specify the the following partitions explicitely: highmem, gpu, mpi

Slurm Partitions (2)

Available by default:

  • debug - very short jobs for debugging
  • short - allows many jobs up to 4h
  • medium - up to 512 cores per user, up to 7days
  • long - up to 64 cores per user, up to 14 days

Request access by email to helpdesk

  • highmem - for high memory node access
  • gpu - for GPU node access
  • mpi - for running MPI jobs
  • critical - for running many jobs for clinical/mission-critical jobs

Tuning squeue Output

  • You can tune the output of many Slurm programs

  • For example squeue -o/--format.

  • See man squeue for details.

  • Example:

    $ squeue -o "%.10i %9P %60j %10u %.2t %.10M %.6D %.4C %20R %b" --user=holtgrem_c
        JOBID PARTITION NAME                                                         USER       ST       TIME  NODES CPUS NODELIST(REASON)     TRES_PER_NODE
      15382564 medium    bash                                                         holtgrem_c  R      30:25      1    1 hpc-cpu-63           N/A
  • Another goodie: pipe into less -S to get a horizontally scrollable output

Your turn 🤸: look at man squeue and find your “top 3 most useful” values.

Useful .bashrc Aliases

alias sbi='srun --pty --time 7-00 --mem=5G --cpus-per-task 2 bash -i'
alias slurm-loginx='srun --pty --time 7-00 --partition long --x11 bash -i'
alias sq='squeue -o "%.10i %9P %60j %10u %.2t %.10M %.6D %.4C %20R %b" "$@"'
alias sql='sq "$@" | less -S'
alias sqme='sq -u holtgrem_c "$@"'
alias sqmel='sqme "$@" | less -S'

The Role of Logging (1)

  • Main use case: ⚡⚡ troubleshooting ⚡⚡
  • While running
    • is my job stuck?
    • is something weird going on?
  • Post mortem analysis
    • why did my job fail?
    • what did it do?

The Role of Logging (2)

In Bash scripts, set -x and set -v are heaven-sent.

  • set -x prints commands before execution
  • (set -v prints commands as they are read)

Logging Pitfalls

  • Log files are written to disk and they are not flushed manually
    • You may thus not see the last lines of your log file!
  • If this is a problem, try using stdbuf in your script and manually redirect to a log file
    • But you probably should not do this by default!

Job Script Temporary File Handling

#!/usr/bin/bash

# ... prelude that does not need $TMPDIR ...

# Create new unique directory below current `$TMPDIR`.
export TMPDIR=$(mktemp -d)
# Setup auto-cleanup of the just-created `$TMPDIR`.
trap "rm -rf $TMPDIR" ERR EXIT

# ... your usual script ...

Submitting GPU Jobs

hpc-login-1$ export | grep CUDA_VISIBLE_DEVICES
hpc-login-1$ srun --partition=gpu --gres=gpu:tesla:1 --pty bash
hpc-gpu-1:~$ export | grep CUDA_VISIBLE_DEVICES
declare -x CUDA_VISIBLE_DEVICES="0"
hpc-gpu-1:~$ exit
res-login-1:~$ srun --partition=gpu --gres=gpu:tesla:2 --pty bash
hpc-gpu-2:~$ export | grep CUDA_VISIBLE_DEVICES
declare -x CUDA_VISIBLE_DEVICES="0,1"

👉 hpc-docs: How-To: Connect to GPU Nodes

Submitting High Memory Jobs (1)

  • The HPC has two nodes with 0.5TB and 1TB of RAM each.
  • To use these nodes, add --partition=highmem to sbatch/srun command line.
  • However:
    • Request access first by email to hpc-helpdesk@bih-charite.de
    • There are only four such nodes!

Submitting High Memory Jobs (2)

  • Can you reduce memory usage?
    • Split your problem into smaller ones?
    • Use more memory efficient data structures / algorithms?
    • E.g., Unix sort allows for external memory (aka disk) sorting and makes for super memory efficient sorting
    • Similarly, samtools sort allows fine-grained control over memory usage

Canceling Jobs with scancel

  • Jobs that are in running or pending state can be cancelled
  • For this: scancel JOBID
  • Cancel all your jobs: scancel --user=$USER
  • See scancel --help for more options

Your turn 🤸: submit a job, cancel it, look at scontrol and sacct output.

QOS and sacctmgr (1)

  • QOS = quality of service
  • Each partition has a QOS (with the same name) attached to it
  • This allows to restrict maximum wall time MaxWall
  • … and the maximum tracked resource per user MaxTRESPU
  • To see these values:
$ sacctmgr show qos -p | cut -d '|' -f 1,19,20 | column -s '|' -t
Name             MaxWall      MaxTRESPU
normal                        cpu=512,mem=3.50T
debug            01:00:00     cpu=1000,mem=7000G
medium           7-00:00:00   cpu=512,mem=3.50T
critical                      cpu=12000,mem=84000G
long             14-00:00:00  cpu=64,mem=448G
...

Job Dependencies with sbatch

  • Each Slurm job correspond to a tasks in task-based parallelism
  • We can model dependencies betwen jobs:
    • sbatch -d afterok:JOBID:JOBID:JOBID:...
    • sbatch -d afterany:JOBID:...
    • sbatch --job-name NAME -d singleton
  • More options / information: man sbatch

Your turn 🤸: write two jobs with -d afterok:JOBID

X11 Forwarding with Slurm

  • Sometimes you need to run a graphical application on the compute nodes

  • You must run a local X11 server (e.g., Linux, MobaXTerm)

  • Then, you can do:

    $ ssh -X hpc-login-1
    hpc-login-1$ srun --x11 --pty bash -i
    hpc-cpu-141$ xterm
  • … and see an XTerm window locally

Your turn 🤸: start xterm if you have an local X11 server.

Reservations

  • Reservations allow administrators to reserve resources for specific users.

  • For example, we have a reservation for this course:

    $ scontrol show reservation hpc-course
    ReservationName=hpc-course StartTime=2023-08-27T00:00:00 EndTime=2023-09-02T00:00:00 Duration=6-00:00:00
      Nodes=hpc-cpu-[76,80,191-194,201-206,208-228],hpc-gpu-[6-7] NodeCnt=35 CoreCnt=592 Features=(null) PartitionName=(null) Flags=IGNORE_JOBS,SPEC_NODES
      TRES=cpu=1184
      Users=holtgrem_c Groups=(null) Accounts=(null) Licenses=(null) State=INACTIVE BurstBuffer=(null) Watts=n/a
      MaxStartDelay=(null)
  • To use this reservation, add --reservation=hpc-course to your sbatch/srun command line.

Conda

  • Introduction
  • Installation
  • Managing environments and instaling software

Software Installation

A challenge? Some options:

What is Conda?

Conda is an open-source, cross-platform, language-agnostic package manager and environment management system.

– Wikipedia


Conda allows you to:

  • install many precompiled packages on your own
  • more than 8.5k bioinformatics packages via bioconda channel
  • manage distinct environment
    • e.g., separate by project if you want to pin versions

Plus, it integrates well into Snakemake (more about that later)

First Steps: Installation 🤸

Use the following steps for installation:

# on login node
srun --partition=training --mem=5G --pty bash -i

# on a compute node
wget -O /tmp/Miniforge3-Linux-x86_64.sh \
  https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-Linux-x86_64.sh
mkdir -p $HOME/work/miniconda3
ln -sr $HOME/work/miniconda3 $HOME/miniconda3
bash /tmp/Miniforge3-Linux-x86_64.sh -s -b -p $HOME/work/miniconda3

Configure:

conda config --add channels defaults
conda config --add channels bioconda
conda config --add channels conda-forge
conda config --set channel_priority strict
cat ~/.condarc

Now you can activate it with

source ~/miniconda3/bin/activate
mamba --help

👉 Mamba User Guide

Second Steps: Managing Environments

Creating an environment:

mamba create --yes --name read-mapping bwa samtools
conda activate read-mapping
## or: source ~/miniconda3/bin/activate read-mapping

Showing what is installed:

conda env export | tee env.yaml
# OUTPUT:
name: read-mapping
channels:
  - conda-forge
  - bioconda
  - defaults
dependencies:
  - _libgcc_mutex=0.1=conda_forge

Second Steps: Create New Environment

Second Steps: Installing Software

Apptainer

  • Introduction
  • Running .sif files with Apptainer
  • Building .sif files from Docker containers
  • Building .sif files from scratch

What is Apptainer?

Apptainer (fka Singularity) is a container system for HPC.

What are containers?

  • Package all software dependencies into one image file.
  • Run the software inside of the image.
  • Bind-mount directories into the container.
flowchart TD
    A[OS User Land] --> B[OS Kernel]
    C[Apptainer Layer] --> B
    D[Your app] --> A
    E[Your Container] --> C

➡️ Reproducible, transferrable, application installations

Preparations

  • Apptainer will download image files to ~/.apptainer
  • We should move this to somewhere with more space than $HOME
$ mkdir -p ~/work/.apptainer
$ ln -s ~/work/.apptainer ~/.apptainer
$ ls -lh ~/work | grep apptainer
lrwxrwxrwx  1 holtgrem_c hpc-ag-cubi   52 Apr 20 08:53 .apptainer -> /data/cephfs-1/home/users/holtgrem_c/work/.apptainer

Running

Before first run:

$ find ~/.apptainer/
/data/cephfs-1/home/users/holtgrem_c/.apptainer/

Running (will download and convert Docker image)

$ apptainer run docker://grycap/cowsay Hello World
INFO:    Converting OCI blobs to SIF format
INFO:    Starting build...
<<<many warnings>>>
INFO:    Creating SIF file...
 ___________________________
< To order, call toll-free. >
 ---------------------------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||

After run:

$ find ~/.apptainer/
$HOME/.apptainer/
...
$HOME/.apptainer/cache/blob/blobs/sha256/bcc2a6e8c5a73d8b8a4d1a75e68946d7c404b2f32b7574f6e5e0d571bf3537c1
...

Converting Docker Image to .sif Files

Build it:

$ apptainer build /tmp/cowsay.sif docker://grycap/cowsay
# INFO:    Starting build...
Getting image source signatures
Copying blob d6e911d60d73 skipped: already exists  
Copying blob 55010f332b04 skipped: already exists  
Copying blob b6f892c0043b skipped: already exists  
Copying blob 3deef3fcbd30 skipped: already exists  
Copying blob cf9722e506aa skipped: already exists  
Copying blob 2955fb827c94 skipped: already exists  
Copying config c1634cdfc2 done  
Writing manifest to image destination
<<<many warnings>>>
INFO:    Creating SIF file...
INFO:    Build complete: /tmp/cowsay.sif

Run it:

$ apptainer run /tmp/cowsay.sif 
 ______________________________________
/ Lazlo's Chinese Relativity Axiom:    \
|                                      |
| No matter how great your triumphs or |
| how tragic your defeats --           |
|                                      |
| approximately one billion Chinese    |
\ couldn't care less.                  /
 --------------------------------------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||

Cleanup:

$ rm -rf ~/.apptainer/*

Building .sif Files from Scratch

$ cat lolcow.def
Bootstrap: docker
From: ubuntu:16.04

%post
    apt-get -y update
    apt-get -y install fortune cowsay lolcat

%environment
    export LC_ALL=C
    export PATH=/usr/games:$PATH

%runscript
    fortune | cowsay | lolcat

Then, build in two-step process (so we don’t need sudo).

$ apptainer build --sandbox /tmp/lolcow lolcow.def
$ apptainer build lolcow.sif /tmp/lolcow
...
$ apptainer run lolcow.sif
 ________________________________________
/ Your temporary financial embarrassment \
| will be relieved in a surprising       |
\ manner.                                /
 ----------------------------------------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||

More here

Bring Your Own Project

🫵 Where can you apply what you have learned in your PhD project?

This is not the end…

… but all for this session

Recap

  • Slurm
    • Introduction
    • Interactive and Batch Jobs
    • Query slurm for job and cluster status
    • Troubleshooting! 😱 🥸 🤓
  • Conda
    • Installation and managing environments
  • Apptainer
    • Introduction
    • Building and Running Images